HuggingFace Daily AI Paper Digest, 2025.09.26 | SciReasoner, an all-rounder across eight events; MMR1 forges open-source multimodal reasoning from the ambiguous zone

Update: 2025-09-26

Description

The 15 papers in this episode are:

[00:20] 🔬 SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines

[01:00] 🧠 MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

[01:41] 📈 VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models

[02:26] 🌳 Tree Search for LLM Agent Reinforcement Learning

[03:06] 🖼 Seedream 4.0: Toward Next-generation Multimodal Image Generation

[03:40] 🎯 Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets

[04:29] 🤖 AutoIntent: AutoML for Text Classification

[05:10] ⚖ TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them

[05:43] 🎢 CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning

[06:30] 🖼 Does FLUX Already Know How to Perform Physically Plausible Image Composition?

[07:31] ✂ CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling

[08:26] 🧠 Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution

[09:12] 🎮 V-GameGym: Visual Game Generation for Code Large Language Models

[09:49] 🗣 Interactive Recommendation Agent with Active User Commands

[10:22] 🔍 BESPOKE: Benchmark for Search-Augmented Large Language Model Personalization via Diagnostic Feedback

[Follow Us]

You can also find us on the following platform for more information beyond the podcast itself:

Xiaohongshu: AI速递
